{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Apply Phi model with HuggingFace Causal ML"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![HuggingFace Logo](https://huggingface.co/front/assets/huggingface_logo-noborder.svg)\n",
    "\n",
    "**HuggingFace** is a popular open-source platform that develops computation tools for building application using machine learning. It is widely known for its Transformers library which contains open-source implementation of transformer models for text, image, and audio task.\n",
    "\n",
    "[**Phi 3**](https://azure.microsoft.com/en-us/blog/introducing-phi-3-redefining-whats-possible-with-slms/) is a family of AI models developed by Microsoft, designed to redefine what is possible with small language models (SLMs). Phi-3 models are the most compatable and cost-effective SLMs, [outperforming models of the same size and even larger ones in language](https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/?msockid=26355e446adb6dfa06484f956b686c27), reasoning, coding, and math benchmarks. \n",
    "\n",
    "![Phi 3 model performance](https://mmlspark.blob.core.windows.net/graphics/The-Phi-3-small-language-models-with-big-potential-1.jpg)\n",
    "\n",
    "To make it easier to scale up causal language model prediction on a large dataset, we have integrated [HuggingFace Causal LM](https://huggingface.co/docs/transformers/tasks/language_modeling) with SynapseML. This integration makes it easy to use the Apache Spark distributed computing framework to process large data on text generation tasks.\n",
    "\n",
    "This tutorial shows hot to apply [phi3 model](https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3) at scale with no extra setting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# %pip install --upgrade transformers==4.49.0 -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chats = [\n",
    "    (1, \"fix grammar: helol mi friend\"),\n",
    "    (2, \"What is HuggingFace\"),\n",
    "    (3, \"translate to Spanish: hello\"),\n",
    "]\n",
    "\n",
    "chat_df = spark.createDataFrame(chats, [\"row_index\", \"content\"])\n",
    "chat_df.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define and Apply Phi3 model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following example demonstrates how to load the remote Phi 3 model from HuggingFace and apply it to chats."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from synapse.ml.hf import HuggingFaceCausalLM\n",
    "\n",
    "phi3_transformer = (\n",
    "    HuggingFaceCausalLM()\n",
    "    .setModelName(\"microsoft/Phi-3-mini-4k-instruct\")\n",
    "    .setInputCol(\"content\")\n",
    "    .setOutputCol(\"result\")\n",
    "    .setModelParam(max_new_tokens=100)\n",
    ")\n",
    "result_df = phi3_transformer.transform(chat_df).collect()\n",
    "display(result_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Apply Chat Template"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyspark.sql.functions import udf\n",
    "from pyspark.sql.types import ArrayType, MapType, StringType\n",
    "\n",
    "reviews = [\n",
    "    (1, \"I like SynapseML\"),\n",
    "    (2, \"Contoso is awful\"),\n",
    "]\n",
    "reviews_df = spark.createDataFrame(reviews, [\"row_index\", \"content\"])\n",
    "\n",
    "PROMPT_1 = f\"\"\"You are an AI assistant that identifies the sentiment of a given text. Respond with only the single word “positive” or “negative.”\n",
    "        \"\"\"\n",
    "\n",
    "\n",
    "@udf\n",
    "def make_template(s: str):\n",
    "    return [{\"role\": \"system\", \"content\": PROMPT_1}, {\"role\": \"user\", \"content\": s}]\n",
    "\n",
    "\n",
    "reviews_df = reviews_df.withColumn(\"messages\", make_template(\"content\"))\n",
    "\n",
    "phi3_transformer = (\n",
    "    HuggingFaceCausalLM()\n",
    "    .setModelName(\"microsoft/Phi-3-mini-4k-instruct\")\n",
    "    .setInputCol(\"messages\")\n",
    "    .setOutputCol(\"result\")\n",
    "    .setModelParam(max_new_tokens=10)\n",
    ")\n",
    "result_df = phi3_transformer.transform(reviews_df).collect()\n",
    "display(result_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Use local cache"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By caching the model, you can reduce initialization time. On Fabric, store the model in a Lakehouse and use setCachePath to load it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# %%sh\n",
    "# azcopy copy \"https://mmlspark.blob.core.windows.net/huggingface/microsoft/Phi-3-mini-4k-instruct\" \"/lakehouse/default/Files/microsoft/\" --recursive=true"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# phi3_transformer = (\n",
    "#     HuggingFaceCausalLM()\n",
    "#     .setCachePath(\"/lakehouse/default/Files/microsoft/Phi-3-mini-4k-instruct\")\n",
    "#     .setInputCol(\"content\")\n",
    "#     .setOutputCol(\"result\")\n",
    "#     .setModelParam(max_new_tokens=1000)\n",
    "# )\n",
    "# result_df = phi3_transformer.transform(chat_df).collect()\n",
    "# display(result_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Utilize GPU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To utilize GPU, passing device_map=\"cuda\", torch_dtype=\"auto\" to modelConfig."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "phi3_transformer = (\n",
    "    HuggingFaceCausalLM()\n",
    "    .setModelName(\"microsoft/Phi-3-mini-4k-instruct\")\n",
    "    .setInputCol(\"content\")\n",
    "    .setOutputCol(\"result\")\n",
    "    .setModelParam(max_new_tokens=100)\n",
    "    .setModelConfig(\n",
    "        device_map=\"cuda\",\n",
    "        torch_dtype=\"auto\",\n",
    "    )\n",
    ")\n",
    "result_df = phi3_transformer.transform(chat_df).collect()\n",
    "display(result_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phi 4\n",
    "\n",
    "To try with the newer version of phi 4 model, simply set the model name to be microsoft/Phi-4-mini-instruct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "phi4_transformer = (\n",
    "    HuggingFaceCausalLM()\n",
    "    .setModelName(\"microsoft/Phi-4-mini-instruct\")\n",
    "    .setInputCol(\"content\")\n",
    "    .setOutputCol(\"result\")\n",
    "    .setModelParam(max_new_tokens=100)\n",
    "    .setModelConfig(\n",
    "        device_map=\"auto\",\n",
    "        torch_dtype=\"auto\",\n",
    "        local_files_only=False,\n",
    "        trust_remote_code=True,\n",
    "    )\n",
    ")\n",
    "result_df = phi4_transformer.transform(chat_df).collect()\n",
    "display(result_df)"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
